The Query Containment Problem: Set Semantics vs. Bag Semantics
Abstract
Query containment is a fundamental algorithmic task in database query processing and optimization. Under set semantics, the query-containment problem for conjunctive queries has long been known to be NP-complete. SQL queries, however, are typically evaluated under bag semantics and return multisets as answers, since duplicates are not eliminated unless explicitly specified. The exact complexity of the query-containment problem for conjunctive queries under bag semantics has been an outstanding and rather poorly understood open problem for twenty years. In fact, to this date, it is not even known whether conjunctive-query containment under bag semantics is decidable. The goal of this talk is to draw attention to this fascinating problem by presenting a comprehensive overview of old and not-so-old results about the complexity of the query-containment problem for conjunctive queries and their variants, under both set semantics and bag semantics.
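To make the gap between the two semantics concrete, the following is a minimal sketch, not taken from the talk, that evaluates two conjunctive queries over a tiny database. The function names, the query encoding, and the example relation R are all assumptions made for illustration. It assumes the database relations are sets and that, under bag semantics, each satisfying assignment of the body variables contributes one copy of the head tuple. On this database the two queries return the same set of answers, yet their multiplicities differ, so the database is a counterexample to bag containment of Q2 in Q1.

```python
from collections import Counter
from itertools import product


def evaluate_bag(head, body, db):
    """Evaluate a conjunctive query under bag semantics (illustrative encoding).

    head: tuple of variable names returned by the query.
    body: list of (relation_name, variable_tuple) atoms.
    db:   dict mapping each relation name to a set of tuples.
    Each satisfying assignment of the body variables adds one copy
    of the corresponding head tuple to the answer.
    """
    variables = sorted({v for _, args in body for v in args})
    domain = sorted({c for rows in db.values() for row in rows for c in row})
    answers = Counter()
    # Brute force: try every assignment of domain values to the variables.
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(tuple(assignment[v] for v in args) in db[rel] for rel, args in body):
            answers[tuple(assignment[v] for v in head)] += 1
    return answers


def evaluate_set(head, body, db):
    """Set semantics: keep the distinct answers and drop the multiplicities."""
    return set(evaluate_bag(head, body, db))


# Q1(x) :- R(x, y)
q1 = (("x",), [("R", ("x", "y"))])
# Q2(x) :- R(x, y), R(x, z)
q2 = (("x",), [("R", ("x", "y")), ("R", ("x", "z"))])

db = {"R": {("a", "b"), ("a", "c")}}

set1, set2 = evaluate_set(*q1, db), evaluate_set(*q2, db)
bag1, bag2 = evaluate_bag(*q1, db), evaluate_bag(*q2, db)

print(set1 == set2)   # same distinct answers on this database
print(dict(bag1))     # multiplicities of Q1's answers
print(dict(bag2))     # multiplicities of Q2's answers
```

A single database can refute a containment but cannot establish one. The brute-force enumeration is exponential in the number of variables and is meant only for toy inputs.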
Related resources
Combined-Semantics Equivalence Is Decidable for a Practical Class of Conjunctive Queries
The problems of query containment and equivalence are fundamental problems in the context of query processing and optimization. In their classic work [2] published in 1977, Chandra and Merlin solved the two problems for the language of conjunctive queries (CQ queries) on relational data, under the “set-semantics” assumption for query evaluation. Alternative semantics, called ba...
Equivalence and Minimization of Conjunctive Queries
The problems of query containment, equivalence, and minimization are fundamental problems in the context of query processing and optimization. In their classic work [2] published in 1977, Chandra and Merlin solved the three problems for the language of conjunctive queries (CQ queries) on relational data, under the “set-semantics” assumption for query evaluation. While the results of [2] have be...
Designing Views to Optimize Real Queries
This paper considers the following problem: given a query workload, a database, and a set of constraints, design a set of views that give equivalent rewritings of the workload queries and globally minimize the evaluation costs of the workload on the database under the constraints. We refer to this problem as “view design for query performance,” or “view design” for short; sets of views that sat...
Datalog: Bag Semantics via Set Semantics
Duplicates in data management are common and problematic. In this work, we present a translation of Datalog under bag semantics into a well-behaved extension of Datalog (the so-called warded Datalog) under set semantics. From a theoretical point of view, this allows us to reason on bag semantics by making use of the well-established theoretical foundations of set semantics. From a practical poi...
Containment of Relational Queries with Annotation Propagation
We study the problem of determining whether a query is contained in another when queries can carry along annotations from source data. We say that a query is annotation-contained in another if the annotated output of the former is contained in that of the latter on every possible annotated input database. We study the relationship between query containment and annotation-containment and show that anno...